ABSTRACT

There is a loss of efficiency when an estimated noise covariance matrix is used in place of the unknown true noise covariance matrix in the construction of the optimum filter for signal detection. In the case of detecting a single signal specified by a real or a complex vector, we investigate the extent of this loss by obtaining an exact confidence bound for the realized signal to noise ratio. We also give an estimate of this ratio which is useful in the optimum selection of features. Some of these results are extended to the case of discrimination between a number of given signals.


1. INTRODUCTION


Reed, Mallett and Brennan (1974) studied the loss of power in signal detection when the noise covariance matrix is unknown and the matrix estimated from sampled noise data is used in the construction of the optimum filter or the linear discriminant function. This was done by computing the expected value of the signal to noise ratio based on the estimated filter and comparing it with the corresponding ratio when the covariance matrix is known. In this paper, we extend the study of the above authors in several directions.

An exact confidence bound is provided for the realized signal to noise ratio when an estimated filter is used. A test is given for examining whether a given set of features is sufficient for signal detection. A criterion is provided for the optimum selection of features. Finally, the problem of discrimination with multiple alternative signals is discussed. We consider both the case where the signal is represented by a real vector and the case where it is represented by a complex vector.

The following notations are used. A' denotes the transpose of a matrix A when its elements are real and A* the conjugate transpose of A when its elements are complex.

i) $X \sim N_p(\mu, \Sigma)$, i.e., a real p-vector X has a p-variate real normal distribution with the probability density function (p.d.f.)

$(2\pi)^{-p/2} |\Sigma|^{-1/2} \exp[-\tfrac{1}{2}(x-\mu)'\Sigma^{-1}(x-\mu)]$.  (1.1)

ii) $X \sim CN_p(\mu, \Sigma)$, i.e., a complex vector X has a p-variate complex normal distribution with the p.d.f.

$\pi^{-p} |\Sigma|^{-1} \exp[-(x-\mu)^*\Sigma^{-1}(x-\mu)]$.  (1.2)

iii) $Y \sim N_{r,s}(M, \Sigma, V)$, i.e., a real r x s matrix Y has the p.d.f.

$(2\pi)^{-rs/2} |\Sigma|^{-s/2} |V|^{-r/2} \exp[-\tfrac{1}{2}\,\mathrm{tr}\,\Sigma^{-1}(Y-M)V^{-1}(Y-M)']$.  (1.3)

iv) $Y \sim CN_{r,s}(M, \Sigma, V)$, i.e., a complex r x s matrix Y has the p.d.f.

$\pi^{-rs} |\Sigma|^{-s} |V|^{-r} \exp[-\mathrm{tr}\,\Sigma^{-1}(Y-M)V^{-1}(Y-M)^*]$.  (1.4)

v) $S \sim W_p(f, \Sigma)$, i.e., a real p x p positive definite matrix S has the Wishart distribution on f degrees of freedom with the p.d.f.

$[2^{pf/2}\,\Gamma_p(f/2)]^{-1} |\Sigma|^{-f/2} |S|^{(f-p-1)/2} \exp(-\tfrac{1}{2}\,\mathrm{tr}\,\Sigma^{-1}S)$  (1.5)

where

$\Gamma_p(a) = \pi^{p(p-1)/4} \prod_{i=1}^{p} \Gamma(a - \tfrac{i-1}{2})$.

vi) $S \sim CW_p(f, \Sigma)$, i.e., a complex p x p positive definite matrix S has the complex Wishart distribution with the p.d.f.

$[\tilde\Gamma_p(f)]^{-1} |\Sigma|^{-f} |S|^{f-p} \exp(-\mathrm{tr}\,\Sigma^{-1}S)$  (1.6)

where

$\tilde\Gamma_p(a) = \pi^{p(p-1)/2} \prod_{i=1}^{p} \Gamma(a-i+1)$.

vii) $S \sim W_p^g(f, \Sigma)$, i.e., a real p x p positive definite matrix S has the p.d.f.

$|\Sigma|^{-f/2} |S|^{(f-p-1)/2}\, g(-\tfrac{1}{2}\,\mathrm{tr}\,\Sigma^{-1}S)$.  (1.7)

viii) $S \sim CW_p^g(f, \Sigma)$, i.e., a complex p x p positive definite matrix S has the p.d.f.

$|\Sigma|^{-f} |S|^{f-p}\, g(-\mathrm{tr}\,\Sigma^{-1}S)$.  (1.8)

2. SOME MULTIVARIATE DISTRIBUTIONS

In this section we derive some new multivariate distributions which arise in the study of problems of signal detection. The actual applications are discussed in Section 3.

Consider the p x p positive definite (p.d.) matrices

$S = \begin{pmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{pmatrix}$,  $\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}$  (2.1)

partitioned by the first r and the remaining s = p - r rows and columns, the Schur complements of order r x r

$S_{1.2} = S_{11} - S_{12}S_{22}^{-1}S_{21}$,  $\Sigma_{1.2} = \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}$  (2.2)

and the regression coefficients of order r x s

$b = S_{12}S_{22}^{-1}$,  $\beta = \Sigma_{12}\Sigma_{22}^{-1}$.  (2.3)

We have the following lemmas, which follow on standard lines (see Rao (1973, pp. 538-539) and Srivastava and Khatri (1979, p. 79)).

Lemma 1. Let $S \sim W_p(f, \Sigma)$ where p = r + s, and let $S_{ij}$, $S_{1.2}$, $\Sigma_{1.2}$, b and $\beta$ be as defined in (2.1)-(2.3). Then the following hold:

1.) $S_{1.2}$ and $(b, S_{22})$ are independently distributed with

$S_{1.2} \sim W_r(f-s, \Sigma_{1.2})$  (2.4)

$S_{22} \sim W_s(f, \Sigma_{22})$  (2.5)

and the conditional distribution of the r x s matrix b given $S_{22}$ is

$b \sim N_{r,s}(\beta, \Sigma_{1.2}, S_{22}^{-1})$.  (2.6)

2.) The unconditional (marginal) p.d.f. of b obtained by integrating over $S_{22}$ is

$\frac{\Gamma_s(\frac{f+r}{2})}{\pi^{rs/2}\,\Gamma_s(\frac{f}{2})}\, |\Sigma_{22}|^{-f/2}\, |\Sigma_{1.2}|^{-s/2}\, |\Sigma_{22}^{-1} + (b-\beta)'\Sigma_{1.2}^{-1}(b-\beta)|^{-(f+r)/2}$

which we denote by

$T_{r,s}(\beta, f, \Sigma_{1.2}, \Sigma_{22})$.  (2.7)

If $b_1 = \Sigma_{1.2}^{-1/2}(b-\beta)\Sigma_{22}^{1/2}$, where $\Sigma_{1.2}^{1/2}$ and $\Sigma_{22}^{1/2}$ represent symmetric square roots, then

$b_1 \sim T_{r,s}(0, f, I_r, I_s)$.  (2.8)

3.) If $u = (I_r + b_1b_1')^{-1/2}b_1 = b_1(I_s + b_1'b_1)^{-1/2}$, then the Jacobian of the transformation from $b_1$ to u is $|I_r - uu'|^{-(r+s+1)/2}$ and hence the p.d.f. of u, derived from (2.8), is

$\frac{\Gamma_s(\frac{f+r}{2})}{\pi^{rs/2}\,\Gamma_s(\frac{f}{2})}\, |I_r - uu'|^{(f-s-1)/2}$  (2.9)

which we denote by

$U_{r,s}(\tfrac{f+r}{2})$.  (2.10)

4.) If $s \ge r$, the p.d.f. of $B = (I_r + b_1b_1')^{-1}$, derived on standard lines, is

$[\beta_r(\tfrac{f+r-s}{2}, \tfrac{s}{2})]^{-1}\, |B|^{(f-s-1)/2}\, |I - B|^{(s-r-1)/2}$

where

$\beta_r(a,b) = \frac{\Gamma_r(a)\Gamma_r(b)}{\Gamma_r(a+b)}$,

which is the r-variate beta distribution denoted by

$B_r(\tfrac{f+r-s}{2}, \tfrac{s}{2})$.  (2.11)

If $b_2 = (b-\beta)\Sigma_{22}^{1/2}$, then the p.d.f. of $B_0 = (\Sigma_{1.2} + b_2b_2')^{-1}$ is

$[\beta_r(\tfrac{f+r-s}{2}, \tfrac{s}{2})]^{-1}\, |\Sigma_{1.2}|^{(f-1)/2}\, |B_0|^{(f-s-1)/2}\, |\Sigma_{1.2}^{-1} - B_0|^{(s-r-1)/2}$

which will be referred to as

$B_r(\tfrac{f+r-s}{2}, \tfrac{s}{2}; \Sigma_{1.2}^{-1})$.  (2.12)


Lemma 2. If $S \sim CW_p(f, \Sigma)$ where p = r + s, then $S_{1.2}$ and $(b, S_{22})$ are independently distributed, and the distributions of the various statistics considered in Lemma 1 are as follows.

1.) $S_{1.2} \sim CW_r(f-s, \Sigma_{1.2})$,  $S_{22} \sim CW_s(f, \Sigma_{22})$.  (2.13)

The conditional distribution of b given $S_{22}$ is

$b \sim CN_{r,s}(\beta, \Sigma_{1.2}, S_{22}^{-1})$.  (2.14)

2.) The marginal distribution of b is

$b \sim CT_{r,s}(\beta, f, \Sigma_{1.2}, \Sigma_{22})$

with the p.d.f.

$\frac{\tilde\Gamma_s(f+r)}{\pi^{rs}\,\tilde\Gamma_s(f)}\, |\Sigma_{1.2}|^{-s}\, |\Sigma_{22}|^{-f}\, |\Sigma_{22}^{-1} + (b-\beta)^*\Sigma_{1.2}^{-1}(b-\beta)|^{-(f+r)}$  (2.15)

and

$b_1 = \Sigma_{1.2}^{-1/2}(b-\beta)\Sigma_{22}^{1/2} \sim CT_{r,s}(0, f, I_r, I_s)$.  (2.16)

3.) If $u = b_1(I_s + b_1^*b_1)^{-1/2} = (I_r + b_1b_1^*)^{-1/2}b_1$, then its p.d.f. is

$\frac{\tilde\Gamma_s(f+r)}{\pi^{rs}\,\tilde\Gamma_s(f)}\, |I_r - uu^*|^{f-s}$

which is denoted by

$u \sim CU_{r,s}(f+r)$.  (2.17)

4.) If $s \ge r$, the p.d.f. of $B = (I_r + b_1b_1^*)^{-1}$, derived on standard lines, is

$[\tilde\beta_r(f+r-s, s)]^{-1}\, |B|^{f-s}\, |I - B|^{s-r}$

which will be referred to as the r-variate complex beta distribution

$CB_r(f+r-s, s)$.  (2.18)

Writing $b_2 = (b-\beta)\Sigma_{22}^{1/2}$, the p.d.f. of $B_0 = (\Sigma_{1.2} + b_2b_2^*)^{-1}$ obtained by a transformation from (2.15) is

$[\tilde\beta_r(f+r-s, s)]^{-1}\, |\Sigma_{1.2}|^{f}\, |B_0|^{f-s}\, |\Sigma_{1.2}^{-1} - B_0|^{s-r}$

which will be referred to as

$CB_r(f+r-s, s; \Sigma_{1.2}^{-1})$.  (2.19)


3. MAIN THEOREMS


In this section, we use the results of Section 2 to derive distributions of some functions of a p x r matrix A, whose columns represent given signals, and $f^{-1}S$, the estimated noise covariance matrix of order p x p. These distributions are used in the next section for drawing inferences on the basis of observed data in signal detection. First we consider the real case and quote the corresponding results for the complex case in the remarks following the theorems.

Theorem 1. Let A be a given p x r matrix of rank r ($\le p/2$) and $S \sim W_p(f, \Sigma)$. Define the r x r matrices

$S_A = (A'S^{-1}A)^{-1}$,  $\Sigma_A = (A'\Sigma^{-1}A)^{-1}$  (3.1)

$B = \Sigma_A^{1/2}\, S_A^{-1}\, (A'S^{-1}\Sigma S^{-1}A)^{-1}\, S_A^{-1}\, \Sigma_A^{1/2}$.  (3.2)

Then $S_A$ and B are independently distributed with

$S_A \sim W_r(f-p+r, \Sigma_A)$  (3.3)

$B \sim B_r(\tfrac{f+r-s}{2}, \tfrac{s}{2})$  (3.4)

where the $B_r$ distribution is as defined in (2.11) and s = p - r.

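Proof. Let $A_1$ be a p x s matrix of rank s (= p - r) such that $A_0 = (A : A_1)$ is nonsingular and $A_1'A = 0$. Then $A_0'SA_0 \sim W_p(f, A_0'\Sigma A_0)$. Writing

$A_0'SA_0 = \begin{pmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{pmatrix}$,  $A_0'\Sigma A_0 = \begin{pmatrix} \theta_{11} & \theta_{12} \\ \theta_{21} & \theta_{22} \end{pmatrix}$,

$V_{1.2} = V_{11} - V_{12}V_{22}^{-1}V_{21}$,  $\theta_{1.2} = \theta_{11} - \theta_{12}\theta_{22}^{-1}\theta_{21}$,

$b_1 = \theta_{1.2}^{-1/2}(V_{12}V_{22}^{-1} - \theta_{12}\theta_{22}^{-1})\theta_{22}^{1/2}$  (3.5)

and using Lemma 1

$V_{1.2} \sim W_r(f-p+r, \theta_{1.2})$  (3.6)

$(I + b_1b_1')^{-1} \sim B_r(\tfrac{f+r-s}{2}, \tfrac{s}{2})$.  (3.7)

Further, $V_{1.2}$ and $(I + b_1b_1')^{-1}$ are independently distributed. Now

$V_{1.2} = A'SA - A'SA_1(A_1'SA_1)^{-1}A_1'SA = A'A\, S_A\, A'A$,

$\theta_{1.2} = A'A\, \Sigma_A\, A'A$.

Then from (3.6), $S_A = (A'A)^{-1}V_{1.2}(A'A)^{-1}$ has the desired distribution (3.3).

Further, using the formula

$\begin{pmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{pmatrix}^{-1} = \begin{pmatrix} 0 & 0 \\ 0 & V_{22}^{-1} \end{pmatrix} + \begin{pmatrix} I \\ -V_{22}^{-1}V_{21} \end{pmatrix} V_{1.2}^{-1}\, (I : -V_{12}V_{22}^{-1})$

we find, after some computations, that B as defined in (3.2) is the same as $(I + b_1b_1')^{-1}$ with $b_1$ as in (3.5). Then (3.7) establishes (3.4). Theorem 1 is proved.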

Remark 1. If $S \sim W_p^g(f, \Sigma)$ as defined in (1.7), then $S_A$ and B as defined in Theorem 1, (3.1) and (3.2), are independently distributed. Further, B has the same p.d.f. (3.4) as in Theorem 1 independently of g, while the same is not true for $S_A$.

Remark 2. Let A be a p x r complex matrix of rank r ($\le p/2$) and $S \sim CW_p(f, \Sigma)$. Then

$S_A = (A^*S^{-1}A)^{-1} \sim CW_r(f-p+r, \Sigma_A)$  (3.8)

$B = \Sigma_A^{1/2}\, S_A^{-1}\, (A^*S^{-1}\Sigma S^{-1}A)^{-1}\, S_A^{-1}\, \Sigma_A^{1/2} \sim CB_r(f+r-s, s)$.  (3.9)

Further, $S_A$ and B are independently distributed.

Remark 3. If S is complex and has the distribution $CW_p^g(f, \Sigma)$, then $S_A$ and B as defined in (3.8) and (3.9) are independently distributed. Further, the distribution of B is as in (3.9) independently of g, while the same is not true for $S_A$.

Theorem 2. Let B be a p x p positive definite matrix such that

$B \sim B_p(\tfrac{f_1}{2}, \tfrac{f_2}{2}; A)$,  $0 < B < A$.

Consider the partitions

$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$,  $B = \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix}$

where $A_{11}$ and $B_{11}$ are r x r matrices, and the Schur complements $A_{2.1}$ and $B_{2.1}$.

Then the statistics $B_{11}$, $B_{2.1}$ and


0  *  <A2-1  -  W'Xi  -  a21aubu>  lBn  +  (An  - 


are independently distributed. Further

$U \sim U_{p-r,r}(\tfrac{f_2}{2})$ as in (2.10).  (3.13)


The results of Theorem 2 were established by Khatri and Pillai (1965) when $A = I_p$. Their proof can be easily extended to our case by noting that

$|A| = |A_{11}|\,|A_{2.1}|$,  $|B| = |B_{11}|\,|B_{2.1}|$,

$|A - B| = |A_{11} - B_{11}|\,|A_{2.1} - B_{2.1}|\,|I - UU'|$

and then computing the necessary Jacobians of the transformations.

Remark 4. In the complex case, let B and A - B be Hermitian positive definite matrices such that

$B \sim CB_p(f_1, f_2; A)$.

Then $B_{11}$, $B_{2.1}$ and U as defined in Theorem 2 are independently distributed. Further

$B_{11} \sim CB_r(f_1, f_2; A_{11})$,  (3.14)

$B_{2.1} \sim CB_{p-r}(f_1 - r, f_2; A_{2.1})$,  (3.15)

$U \sim CU_{p-r,r}(f_2)$ as in (2.17).  (3.16)


Theorem 3. Let X and Y be independent univariate gamma, G(1,m), and beta, $B_1(m-c+1, a)$, variables with the p.d.f.'s

$\frac{1}{\Gamma(m)}\, e^{-x} x^{m-1}$,  $x > 0$, $m > 0$  (3.17)

$\frac{1}{\beta(m-c+1, a)}\, y^{m-c}(1-y)^{a-1}$,  $0 < y < 1$, $a > 0$, $m-c+1 > 0$.  (3.18)

Then the p.d.f. of Z = XY is

$e^{-z} z^{m-1}\, \frac{\Gamma(a+m-c+1)}{\Gamma(m)\,\Gamma(m-c+1)}\, \Psi(a,c;z)$  (3.19)

where $\Psi$ is the confluent hypergeometric function of the second kind defined by

$\Psi(a,c;z) = \frac{1}{\Gamma(a)} \int_0^\infty t^{a-1}(1+t)^{c-a-1}\exp(-zt)\, dt$  (3.20)

(see Erdelyi et al (1953, p. 255) or Lebedev (1972, p. 268)).

Proof. The result is obtained by writing the joint distribution of X and Y and making the transformation

$Z = XY$,  $t = (1-Y)/Y$,

so that $X = Z(1+t)$, and then integrating over t.
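Theorem 3 can be checked numerically at a single point. The sketch below, assuming the density has the form $e^{-z}z^{m-1}\Gamma(a+m-c+1)\Psi(a,c;z)/[\Gamma(m)\Gamma(m-c+1)]$ with $\Psi$ given by the integral (3.20), compares it with the product-density integral $\int_0^1 f_X(z/y) f_Y(y)\, y^{-1} dy$. The function names, the midpoint rule, the truncation point `upper` and the grid size `n` are illustrative choices, not part of the paper, and the crude quadrature is only suitable for $a \ge 1$.

```python
import math

def psi(a, c, z, upper=60.0, n=200_000):
    """Midpoint-rule approximation of the integral (3.20), truncated at `upper`."""
    h = upper / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += t ** (a - 1) * (1 + t) ** (c - a - 1) * math.exp(-z * t)
    return total * h / math.gamma(a)

def pdf_theorem3(z, m, a, c):
    """Density of Z = XY in the form (3.19)."""
    const = math.gamma(a + m - c + 1) / (math.gamma(m) * math.gamma(m - c + 1))
    return math.exp(-z) * z ** (m - 1) * const * psi(a, c, z)

def pdf_direct(z, m, a, c, n=200_000):
    """Density of Z = XY computed directly as int_0^1 f_X(z/y) f_Y(y) / y dy."""
    beta_const = math.gamma(m - c + 1) * math.gamma(a) / math.gamma(a + m - c + 1)
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        y = (i + 0.5) * h
        x = z / y
        f_x = math.exp(-x) * x ** (m - 1) / math.gamma(m)      # gamma G(1, m)
        f_y = y ** (m - c) * (1 - y) ** (a - 1) / beta_const   # beta B_1(m-c+1, a)
        total += f_x * f_y / y
    return total * h

# Parameters of the complex case with p = 4, f = 8: m = f-p+1, a = p-1, c = 0.
p1 = pdf_theorem3(2.0, m=5, a=3, c=0)
p2 = pdf_direct(2.0, m=5, a=3, c=0)
print(p1, p2)
```

The two numbers should agree to several digits, which is just the substitution $t = (1-y)/y$ carried out numerically.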


Remark 5. The function $\Psi(a,c;z)$ exists for all a and c and has the following representations in infinite series:

$\Psi(a,c;z) = \frac{\Gamma(1-c)}{\Gamma(1+a-c)}\, {}_1F_1(a,c;z) + \frac{\Gamma(c-1)}{\Gamma(a)}\, z^{1-c}\, {}_1F_1(1+a-c, 2-c; z)$

provided $c \ne 0, \pm 1, \pm 2, \ldots$ and $\Gamma(c+1) = c\,\Gamma(c)$ for any $c \ne 0, \pm 1, \ldots$;

$\Psi(a,n+1;z) = \frac{(-1)^{n-1}}{n!\,\Gamma(a-n)} \sum_{k=0}^{\infty} \frac{(a)_k\, z^k}{(n+1)_k\, k!}\, [\psi(a+k) - \psi(1+k) - \psi(n+1+k) + \log z] + \frac{1}{\Gamma(a)} \sum_{k=0}^{n-1} \frac{(-1)^k (n-k-1)!\, (a-n)_k}{k!}\, z^{k-n}$

if n = 0,1,2,... and $a \ne 0, -1, -2, \ldots$, where $\psi(x) = \Gamma'(x)/\Gamma(x)$, and the last term is zero if n = 0.

If a = -m, m = 0,1,..., and c = n + 1, n = 0,1,..., then

$\Psi(-m, n+1; z) = (-1)^m (n+1)_m\, {}_1F_1(-m, n+1; z)$

where

${}_1F_1(a,c;z) = \frac{\Gamma(c)}{\Gamma(a)\,\Gamma(c-a)} \int_0^1 e^{zt}\, t^{a-1}(1-t)^{c-a-1}\, dt$.

4.  TESTS  FOR  ADDITIONAL  INFORMATION 

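Let us consider the case of discrimination of a given signal from pure noise. A question of some practical importance is the number of features to be measured. Let us consider a signal $\delta$ with p = r + s features and an estimate $f^{-1}S$ of the unknown $\Sigma$ based on f degrees of freedom (or f samples from the noise process) in partitioned forms

$\delta = \begin{pmatrix} \delta_1 \\ \delta_2 \end{pmatrix}$,  $\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}$,  $S = \begin{pmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{pmatrix}$  (4.1)

where $\delta_1$ is an r-vector, $\delta_2$ is an s-vector, $\Sigma_{11}$ is an r x r matrix and so on. The signal to noise ratio based on $\delta$ (all the features) is $\delta'\Sigma^{-1}\delta$ while that based on $\delta_2$ is $\delta_2'\Sigma_{22}^{-1}\delta_2$. If $\delta_1$ is redundant, then

$0 = \delta'\Sigma^{-1}\delta - \delta_2'\Sigma_{22}^{-1}\delta_2 = (\delta_1 - \beta\delta_2)'\Sigma_{1.2}^{-1}(\delta_1 - \beta\delta_2)$,  $\beta = \Sigma_{12}\Sigma_{22}^{-1}$  (4.2)

which implies that $\delta_1 = \beta\delta_2$. We develop a test of the null hypothesis

$H_0: \delta_1 = \beta\delta_2$  (4.3)

on the basis of the information provided by S.

We first consider the case where $\delta$ and S are real. From Lemma 1, $S_{1.2}$ and $b = S_{12}S_{22}^{-1}$ are independently distributed for given $S_{22}$ with

$S_{1.2} \sim W_r(f-s, \Sigma_{1.2})$,  $b \sim N_{r,s}(\beta, \Sigma_{1.2}, S_{22}^{-1})$.  (4.4)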


Then from the standard MANOVA theory (see Rao (1973, pp. 547-550) and Srivastava and Khatri (1979, pp. 166-172)), the test statistic for testing $H_0$ in (4.3) is

$T^2 = \frac{f-p+1}{r}\, \frac{(\delta_1 - b\delta_2)'S_{1.2}^{-1}(\delta_1 - b\delta_2)}{\delta_2'S_{22}^{-1}\delta_2}$  (4.5)

which has Hotelling's $T^2$ or F distribution on r and (f-p+1) degrees of freedom. An alternative way of computing (4.5) is

$T^2 = \frac{f-p+1}{r} \left[ \frac{\delta'S^{-1}\delta}{\delta_2'S_{22}^{-1}\delta_2} - 1 \right]$.  (4.6)

The test (4.5) is important since in practical applications with an estimated covariance matrix, inclusion of too many features may reduce the power of discrimination (see Rao (1971)).
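The agreement between the two ways of computing the statistic can be illustrated in the smallest case, p = 2 with r = s = 1, where every matrix is a scalar. The sketch below assumes the statistic has the forms $T^2 = \frac{f-p+1}{r}(\delta_1 - b\delta_2)'S_{1.2}^{-1}(\delta_1 - b\delta_2)/(\delta_2'S_{22}^{-1}\delta_2)$ and $T^2 = \frac{f-p+1}{r}[\delta'S^{-1}\delta/(\delta_2'S_{22}^{-1}\delta_2) - 1]$; the function names and the numerical inputs are hypothetical.

```python
def t2_residual_form(d1, d2, s11, s12, s22, f):
    """T^2 from the residual delta_1 - b*delta_2, for p = 2, r = s = 1."""
    p, r = 2, 1
    b = s12 / s22                      # regression coefficient S12 * S22^{-1}
    s1_2 = s11 - s12 * s12 / s22       # Schur complement S_{1.2}
    return (f - p + 1) / r * (d1 - b * d2) ** 2 / s1_2 / (d2 * d2 / s22)

def t2_ratio_form(d1, d2, s11, s12, s22, f):
    """T^2 from the ratio of the full and reduced quadratic forms."""
    p, r = 2, 1
    det = s11 * s22 - s12 * s12
    full = (s22 * d1 * d1 - 2 * s12 * d1 * d2 + s11 * d2 * d2) / det  # delta' S^{-1} delta
    reduced = d2 * d2 / s22                                            # delta_2' S22^{-1} delta_2
    return (f - p + 1) / r * (full / reduced - 1)

# Hypothetical signal and estimated covariance entries.
args = dict(d1=1.0, d2=2.0, s11=4.0, s12=1.0, s22=3.0, f=10)
print(t2_residual_form(**args), t2_ratio_form(**args))
```

Both forms rest on the identity $\delta'S^{-1}\delta - \delta_2'S_{22}^{-1}\delta_2 = (\delta_1 - b\delta_2)'S_{1.2}^{-1}(\delta_1 - b\delta_2)$, the sample analogue of (4.2).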

Let us consider the case of k signals represented by the columns of a p x k matrix A. Writing

$A = \begin{pmatrix} A_1 \\ A_2 \end{pmatrix}$  (4.7)

where $A_1$ is an r x k matrix, we ask the question whether $A_1$ is redundant. The test for this again follows from the general MANOVA theory (see Rao (1973, pp. 547-550) and Srivastava and Khatri (1979, pp. 166-172)). The likelihood ratio test gives the $\Lambda$ criterion

$\Lambda = \frac{|S_{1.2}|}{|S_{1.2} + (A_1 - bA_2)(A_2'S_{22}^{-1}A_2)^{-1}(A_1 - bA_2)'|}$


22  1 


S+AA  ' 


lS22+A2A2' 


(4.8) 


which is distributed as

$\Lambda(r, f-s, k)$.  (4.9)

Several approximations for computing the significance of an observed value of $\Lambda$ are described in Rao (1973, pp. 555-556) and Srivastava and Khatri (1979, pp. 176-186).

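Remark 6. When S has the complex Wishart distribution, the corresponding test for $H_0: \delta_1 = \beta\delta_2$ is

$T^2 = \frac{f-p+1}{r}\, \frac{(\delta_1 - b\delta_2)^*S_{1.2}^{-1}(\delta_1 - b\delta_2)}{\delta_2^*S_{22}^{-1}\delta_2}$  (4.10)

which has the complex Hotelling's $T^2$ or F distribution with 2r and 2(f-p+1) degrees of freedom. An alternative way of computing (4.10) is

$T^2 = \frac{f-p+1}{r} \left[ \frac{\delta^*S^{-1}\delta}{\delta_2^*S_{22}^{-1}\delta_2} - 1 \right]$.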


For the case of k signals represented by the columns of a p x k matrix

$A = \begin{pmatrix} A_1 \\ A_2 \end{pmatrix}$,

the likelihood ratio test for $H_0: A_1 = \Sigma_{12}\Sigma_{22}^{-1}A_2$ is

$\Lambda = |S_{1.2}| \,/\, |S_{1.2} + (A_1 - bA_2)(A_2^*S_{22}^{-1}A_2)^{-1}(A_1 - bA_2)^*|$

-  ( | S | / | S+AA*| )  t  (|S22|/|S22+A2A*|) .  (4.11) 

which  is  distributed  as 

A(2r ,  2 (f-r)  ,  k) . 

5. LOSS DUE TO ESTIMATION OF $\Sigma$ IN DETECTING A SIGNAL

If $\Sigma$, the noise covariance matrix, is known, then the optimum filter for the detection of a signal $\delta$ is $\delta'\Sigma^{-1}X$ (or $\delta^*\Sigma^{-1}X$) when X is a real (or a complex) vector observation. [In the sequel we consider both the real and complex cases, indicating the expressions for the complex case within brackets as above.] The signal to noise ratio, which is an index of the efficiency of discrimination, in such a case is $\delta'\Sigma^{-1}\delta$ (or $\delta^*\Sigma^{-1}\delta$). If $\Sigma$ is not known but an estimate $f^{-1}S$ based on f degrees of freedom is available, we may use the estimated filter $f\delta'S^{-1}X$ (or $f\delta^*S^{-1}X$). The signal to noise ratio in such a case is

$\rho(S,\Sigma) = \frac{(\delta'S^{-1}\delta)^2}{\delta'S^{-1}\Sigma S^{-1}\delta}$  (or $\rho(S,\Sigma) = \frac{(\delta^*S^{-1}\delta)^2}{\delta^*S^{-1}\Sigma S^{-1}\delta}$).  (5.1)

By the Cauchy-Schwarz inequality this is not greater than

$\delta'\Sigma^{-1}\delta$  (or $\delta^*\Sigma^{-1}\delta$)  (5.2)

so that there is a loss of information in using $f^{-1}S$ instead of $\Sigma$.

The efficiency of the estimated filter can be examined by considering the ratio of (5.1) to (5.2)

$\beta = \frac{(\delta'S^{-1}\delta)^2}{(\delta'\Sigma^{-1}\delta)(\delta'S^{-1}\Sigma S^{-1}\delta)}$  (or $\frac{(\delta^*S^{-1}\delta)^2}{(\delta^*\Sigma^{-1}\delta)(\delta^*S^{-1}\Sigma S^{-1}\delta)}$).  (5.3)

Using Theorem 1, (3.4), by putting r = 1 and s = p - 1, the distribution of (5.3) is obtained as the univariate beta

$B_1(\tfrac{f-p+2}{2}, \tfrac{p-1}{2})$  (or $CB_1(f-p+2, p-1)$).  (5.4)

The distribution (5.4) in the complex case was earlier obtained by Reed, Mallett and Brennan (1974). By computing the expected value of the distribution, they provided the rule f = 2p for maintaining an average loss ratio of better than half. But the distributions (5.4) can be used in other ways. For instance, by using incomplete beta tables one can determine the value of f, the number of noise samples used for estimating $\Sigma$, to ensure for any given p an efficiency larger than any given value with an assigned probability.
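The expected-value rule can be made concrete. A beta variable with parameters (a, b) has mean a/(a+b), so both forms of (5.4) give the mean efficiency (f-p+2)/(f+1). The sketch below, with hypothetical function names, evaluates this mean and searches for the smallest f that reaches a target average efficiency; starting the search at f = p is an assumption made here so that S can be nonsingular.

```python
def expected_efficiency(p, f):
    """Mean of the beta distribution (5.4): (f - p + 2) / (f + 1)."""
    return (f - p + 2) / (f + 1)

def min_samples(p, target):
    """Smallest f >= p whose mean efficiency is at least `target` (0 < target < 1)."""
    f = p
    while expected_efficiency(p, f) < target:
        f += 1
    return f

for p in (4, 10, 20):
    f = min_samples(p, 0.5)
    print(p, f, expected_efficiency(p, f))
```

For a target of one half the search returns f = 2p - 3, which is in line with the f = 2p rule quoted above. Probability statements about the efficiency, as opposed to its mean, need the incomplete beta function and are not covered by this sketch.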

The signal to noise ratio (5.1) for any realized value S depends on the unknown $\Sigma$, which makes it difficult to assess the performance of any particular estimated filter. We suggest two ways of drawing inference on (5.1) in terms of known quantities.

First, we may find a constant c such that

$E[\rho(S,\Sigma) - cf\delta'S^{-1}\delta]^2$  (or $E[\rho(S,\Sigma) - cf\delta^*S^{-1}\delta]^2$)  (5.5)

is a minimum. The optimum c is

$c = \frac{E[\rho(S,\Sigma)\cdot\delta'S^{-1}\delta]}{fE(\delta'S^{-1}\delta)^2}$  (or $\frac{E[\rho(S,\Sigma)\cdot\delta^*S^{-1}\delta]}{fE(\delta^*S^{-1}\delta)^2}$)  (5.6)

which is easily evaluated using the independence of $\rho(S,\Sigma)$ and $\delta'S^{-1}\delta$ (or $\rho(S,\Sigma)$ and $\delta^*S^{-1}\delta$) and the distributions derived in Theorem 1, (3.3) and (3.4) or (3.8) and (3.9), by choosing r = 1 and s = p - 1. The value of c turns out to be

$\frac{(f-p+2)(f-p-3)}{f(f+1)}$  (5.7)

in either case. Then defining the estimated Mahalanobis distance $D_p^2 = f\delta'S^{-1}\delta$ (or $f\delta^*S^{-1}\delta$), we can use the known quantity

$\frac{(f-p+2)(f-p-3)}{f(f+1)}\, D_p^2 = \left(1 - \frac{p-1}{f+1}\right)\left(1 - \frac{p+3}{f}\right) D_p^2$  (5.8)

as an approximation to $\rho(S,\Sigma)$ for judging the efficiency of an estimated filter. Note that if f is not large compared to p, then $D_p^2$ overestimates the efficiency of discrimination.

The formula (5.8) is also useful in examining the gain in discrimination efficiency by increasing the number of features. For instance, the estimated signal to noise ratio with a subset of r features out of p, represented by a vector $\delta_1$, is

$\frac{(f-r+2)(f-r-3)}{f(f+1)}\, D_r^2$  (5.9)

where $D_r^2 = f\delta_1'S_{11}^{-1}\delta_1$ (or $f\delta_1^*S_{11}^{-1}\delta_1$) with $S_{11}$ as the partition of S arising out of the first r columns and r rows. If p > r, then $D_p^2 \ge D_r^2$, but (5.9) may be >, < or = (5.8), and an appropriate decision may be taken depending on the actual relationship. It is possible that with an estimated S, the inclusion of a large number of features may decrease the discrimination efficiency, a phenomenon observed in several multivariate situations (see Rao (1971)).
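The comparison between (5.8) and (5.9) can be sketched in a few lines. The code below applies the factor $(f-k+2)(f-k-3)/[f(f+1)]$ to an observed Mahalanobis distance on k features; the function names and the two distance values are hypothetical and chosen only to show that a smaller feature set can give the larger estimated signal to noise ratio.

```python
def shrinkage(k, f):
    """Factor (f-k+2)(f-k-3) / (f(f+1)) applied to D_k^2 in (5.8) and (5.9)."""
    return (f - k + 2) * (f - k - 3) / (f * (f + 1))

def estimated_snr(d2, k, f):
    """Estimated signal to noise ratio from an observed distance d2 on k features."""
    return shrinkage(k, f) * d2

f = 16
full = estimated_snr(10.0, k=4, f=f)    # hypothetical D_p^2 = 10 with p = 4
subset = estimated_snr(8.0, k=2, f=f)   # hypothetical D_r^2 = 8 with r = 2
print(full, subset)
```

Here the full set has the larger raw distance, yet the subset has the larger estimate, so under this criterion the two extra features would not be retained.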

A more satisfactory approach is to determine a confidence bound for $\rho(S,\Sigma)$ in terms of known quantities. This is done by using the distribution derived in Theorem 3 of Section 3.

From (5.4)

$Y = \frac{\rho(S,\Sigma)}{\delta'\Sigma^{-1}\delta} \sim B_1(\tfrac{f-p+2}{2}, \tfrac{p-1}{2})$  (or $Y = \frac{\rho(S,\Sigma)}{\delta^*\Sigma^{-1}\delta} \sim CB_1(f-p+2, p-1)$)  (5.10)

as beta variables, and

$X = \frac{\delta'\Sigma^{-1}\delta}{2\,\delta'S^{-1}\delta} \sim G(1, \tfrac{f-p+1}{2})$  (or $X = \frac{\delta^*\Sigma^{-1}\delta}{\delta^*S^{-1}\delta} \sim G(1, f-p+1)$)  (5.11)

as gamma variables using the notation of Rao (1973, p. 164), and, further, X and Y are independent. Then from Theorem 3

$Z = XY = \frac{f\rho(S,\Sigma)}{2D_p^2}$  (or $Z = XY = \frac{f\rho(S,\Sigma)}{D_p^2}$)  (5.12)

where $D_p^2 = f\delta'S^{-1}\delta$ (or $f\delta^*S^{-1}\delta$), has the confluent hypergeometric distribution

$e^{-z} z^{m-1}\, \frac{\Gamma(a+m-c+1)}{\Gamma(m)\,\Gamma(m-c+1)}\, \Psi(a,c;z)$  (5.13)

which is independent of the unknown $\Sigma$, with

$m = \tfrac{f-p+1}{2}$, $a = \tfrac{p-1}{2}$, $c = \tfrac{1}{2}$  (or $m = f-p+1$, $a = p-1$, $c = 0$).  (5.14)

If $z_\alpha$ is the lower $\alpha$ point of the distribution, then

$P(\rho(S,\Sigma) \ge \tfrac{2z_\alpha}{f} D_p^2) = 1-\alpha$  (or $P(\rho(S,\Sigma) \ge \tfrac{z_\alpha}{f} D_p^2) = 1-\alpha$)  (5.15)

so that

$\rho(S,\Sigma) \ge \tfrac{2z_\alpha}{f} D_p^2$  (or $\rho(S,\Sigma) \ge \tfrac{z_\alpha}{f} D_p^2$)  (5.16)

provides a lower bound to the realized signal to noise ratio at a confidence level of $100(1-\alpha)\%$.

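The equation satisfied by $z_\alpha$ in the real case is

$\alpha = \int_0^1 \frac{\Gamma(\frac{f+1}{2})}{\Gamma(\frac{p-1}{2})\,\Gamma(\frac{f-p+2}{2})}\, y^{(f-p)/2}(1-y)^{(p-3)/2} \left[ \int_0^{z_\alpha/y} \frac{1}{\Gamma(\frac{f-p+1}{2})}\, e^{-x} x^{(f-p-1)/2}\, dx \right] dy$

and in the complex case it is

$\alpha = \int_0^1 \frac{\Gamma(f+1)}{\Gamma(p-1)\,\Gamma(f-p+2)}\, y^{f-p+1}(1-y)^{p-2} \left[ \int_0^{z_\alpha/y} \frac{1}{\Gamma(f-p+1)}\, e^{-x} x^{f-p}\, dx \right] dy$.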

The values of $z_\alpha$ can be found by a suitable computer algorithm. For instance, the multiplying coefficients (see (5.16)) for the observed Mahalanobis distance to provide 50% and 95% lower confidence bounds to the realized signal to noise ratio are given below for p = 4 and f = 8, 12 and 16.

          2z_α/f (real case)      z_α/f (complex case)
   f       50%       95%            50%       95%
   8      .345      .075           .381      .141
  12      .525      .188           .553      .283
  16      .631      .281           .649      .377

Detailed tables will appear in a later communication.
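The multiplying coefficients can be approximated by simulation instead of numerical integration. The sketch below draws Z = XY with X a gamma variable and Y an independent beta variable, using the parameters implied by (5.10) and (5.11), and reads off the lower $\alpha$ point; the function name, sample size and seed are illustrative, and the result carries Monte Carlo error of roughly the third decimal place.

```python
import random

def multiplier(p, f, confidence, complex_case, n=100_000, seed=0):
    """Monte Carlo estimate of 2*z_alpha/f (real) or z_alpha/f (complex)."""
    rng = random.Random(seed)
    if complex_case:
        m, a1, b1, scale = f - p + 1, f - p + 2, p - 1, 1.0 / f
    else:
        m, a1, b1, scale = (f - p + 1) / 2, (f - p + 2) / 2, (p - 1) / 2, 2.0 / f
    # Z = X * Y with X ~ G(1, m) and Y ~ beta(a1, b1), independent.
    draws = sorted(rng.gammavariate(m, 1.0) * rng.betavariate(a1, b1)
                   for _ in range(n))
    alpha = 1.0 - confidence          # lower tail probability
    return scale * draws[int(alpha * n)]

for f in (8, 12, 16):
    row = [multiplier(4, f, conf, cplx)
           for cplx in (False, True) for conf in (0.50, 0.95)]
    print(f, [round(v, 3) for v in row])
```

Each printed row follows the column order of the table above (real 50%, real 95%, complex 50%, complex 95%).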

6. LOSS DUE TO ESTIMATION OF $\Sigma$ IN MULTIPLE DISCRIMINATION

Consider the problem of identifying a received message as noise or one of r possible signals $\delta_1, \ldots, \delta_r$, which we represent by a p x r matrix $A = (\delta_1 : \ldots : \delta_r)$. Further, let X be a vector of observed features with covariance matrix $\Sigma$ and $E(X) = \delta_i$ when the i-th signal is transmitted, i = 1,...,r, and E(X) = 0 for noise. Then the overall efficiency of discrimination using X can be judged by a function of the eigenvalues of

$AA'$ (or $AA^*$) with respect to $\Sigma$  (6.1)

which are the same as the eigenvalues of

$A'\Sigma^{-1}A$ (or $A^*\Sigma^{-1}A$).  (6.2)

This provides a generalization of the signal to noise ratio $\delta'\Sigma^{-1}\delta$ (or $\delta^*\Sigma^{-1}\delta$) in the case of a single signal.

If the noise has the $N_p(0, \Sigma)$ distribution, then the decision function for the detection of r signals is based on the sufficient statistics

$\delta_i'\Sigma^{-1}X$ (or $\delta_i^*\Sigma^{-1}X$),  i = 1,...,r  (6.3)

which can be written as the discriminant vector $Y = A'\Sigma^{-1}X$ (or $A^*\Sigma^{-1}X$) with the covariance matrix $A'\Sigma^{-1}A$ (or $A^*\Sigma^{-1}A$), and $E(Y) = A'\Sigma^{-1}\delta_i$ (or $A^*\Sigma^{-1}\delta_i$) for the i-th signal. The efficiency of discrimination in using Y instead of X, using the formula (6.1), depends on the eigenvalues of

(A ' Z_1A) (A ' E-1A) _1 (A ' E-1A)  with  respect  A'Z-1A  (6. A) 

(or with $A^*$ in the place of $A'$), which are the same as those for X, as expected.

If $\Sigma$ is not known but an estimate $f^{-1}S$ is available, then the estimated discriminant vector is

$Y = A'S^{-1}X$ (or $A^*S^{-1}X$)  (6.5)

and its efficiency depends on the eigenvalues of

$B = (A'S^{-1}A)(A'S^{-1}\Sigma S^{-1}A)^{-1}A'S^{-1}A$  (6.6)

(or with $A'$ replaced by $A^*$), which is a generalization of $\rho(S,\Sigma)$ as considered in (5.1).

In Theorem 1, we found the distribution of the matrices $(A'S^{-1}A)^{-1}$ and $(A'\Sigma^{-1}A)^{-1/2}B(A'\Sigma^{-1}A)^{-1/2}$ in the case of real variables, and of the matrices $(A^*S^{-1}A)^{-1}$ and $(A^*\Sigma^{-1}A)^{-1/2}B(A^*\Sigma^{-1}A)^{-1/2}$ in the complex case. We use these distributions in examining the realized efficiency through the estimated discriminant vector.

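For this purpose, we consider two particular functions of the eigenvalues of B, one of which is the sum

$Z_1 = \mathrm{tr}\,B = \mathrm{tr}[(A'S^{-1}A)^2(A'S^{-1}\Sigma S^{-1}A)^{-1}] = \sum_{i=1}^{r} \delta_i'S^{-1}A(A'S^{-1}\Sigma S^{-1}A)^{-1}A'S^{-1}\delta_i$  (6.7)

(or with $\delta_i'$ and $A'$ replaced by $\delta_i^*$ and $A^*$), and another is the product

$Z_2 = |B| = \frac{|A'S^{-1}A|^2}{|A'S^{-1}\Sigma S^{-1}A|}$  (or $\frac{|A^*S^{-1}A|^2}{|A^*S^{-1}\Sigma S^{-1}A|}$).  (6.8)

Using Theorem 2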


ECZi}  =  5iz'l5i  (or  6iz‘l6i^ 


[  tr(A’E_1A)  (or  A*E_1A)  ] 


(6.9) 


e(z2)  =  [  n  ](|A’E_1A|2  or  !a*e  :A|2j. 


(6.10) 


The formulas (6.9) and (6.10) enable us to choose a suitable value of f for given p and r to keep the average loss at a desired level.